The multivariate Watson distribution: Maximum-likelihood estimation and other aspects

Authors

  • Suvrit Sra
  • Dmitrii Karp
Abstract

This paper studies fundamental aspects of modelling data using multivariate Watson distributions. Although these distributions are natural for modelling axially symmetric data (i.e., unit vectors where ±x are equivalent), using them in high dimensions can be difficult, largely because for Watson distributions even basic tasks such as maximum-likelihood estimation are numerically challenging. To tackle the numerical difficulties some approximations have been derived. But these are either grossly inaccurate in high dimensions [K.V. Mardia, P. Jupp, Directional Statistics, second ed., John Wiley & Sons, 2000] or, when reasonably accurate [A. Bijral, M. Breitenbach, G.Z. Grudic, Mixture of Watson distributions: a generative model for hyperspherical embeddings, in: Artificial Intelligence and Statistics, AISTATS 2007, 2007, pp. 35–42], they lack theoretical justification. We derive new approximations to the maximum-likelihood estimates; our approximations are theoretically well-defined, numerically accurate, and easy to compute. We build on our parameter estimation and discuss mixture-modelling with Watson distributions; here we uncover a hitherto unknown connection to the "diametrical clustering" algorithm of Dhillon et al. [I.S. Dhillon, E.M. Marcotte, U. Roshan, Diametrical clustering for identifying anticorrelated gene clusters, Bioinformatics 19 (13) (2003) 1612–1619]. © 2012 Elsevier Inc. All rights reserved.
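To make the numerical difficulty concrete, here is a minimal sketch of the concentration step of Watson maximum-likelihood estimation. It does not reproduce the paper's approximations; it swaps in plain bisection as a brute-force reference. The abstract does not state the estimating equation, so the sketch assumes the standard form: for data on the unit sphere in p dimensions, the concentration κ solves g(1/2, p/2; κ) = r, where g(a, c; κ) = M'(a, c, κ) / M(a, c, κ), M is Kummer's confluent hypergeometric function, and r = μᵀSμ is the scatter of the data along the estimated mean axis μ. The names `kummer_m`, `watson_g`, `solve_kappa` and the bracket `kappa_max` are hypothetical, chosen for illustration only.

```python
import math


def kummer_m(a, c, x, tol=1e-15, max_terms=100000):
    """Kummer function M(a, c, x) for x >= 0 and a, c > 0, via its power series."""
    if x < 0:
        raise ValueError("this series is only used for x >= 0")
    term = 1.0
    total = 1.0
    for k in range(max_terms):
        # Ratio of consecutive series terms: (a + k) / (c + k) * x / (k + 1).
        term *= (a + k) / (c + k) * x / (k + 1)
        total += term
        if term <= tol * total:
            return total
    raise RuntimeError("series did not converge")


def watson_g(kappa, p):
    """g(1/2, p/2; kappa) = M'(a, c, kappa) / M(a, c, kappa), using M' = (a/c) M(a+1, c+1, .)."""
    a, c = 0.5, p / 2.0
    if kappa >= 0:
        return (a / c) * kummer_m(a + 1, c + 1, kappa) / kummer_m(a, c, kappa)
    # Kummer's transformation M(a, c, -x) = exp(-x) M(c - a, c, x) avoids an
    # alternating series; the exp(-x) factors cancel in the ratio.
    x = -kappa
    return (a / c) * kummer_m(c - a, c + 1, x) / kummer_m(c - a, c, x)


def solve_kappa(r, p, kappa_max=500.0, iters=100):
    """Solve watson_g(kappa, p) = r by bisection (assumed bracket, not from the paper)."""
    if p < 2 or not 0.0 < r < 1.0:
        raise ValueError("need p >= 2 and 0 < r < 1")
    lo, hi = -kappa_max, kappa_max
    if not watson_g(lo, p) < r < watson_g(hi, p):
        raise ValueError("r is too close to 0 or 1 for the chosen bracket")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        # g is increasing in kappa, so move the lower end up while g(mid) < r.
        if watson_g(mid, p) < r:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    kappa = solve_kappa(0.6, 3)
    print(kappa, watson_g(kappa, 3))
```

Each bisection step evaluates two hypergeometric series, and the number of terms needed grows with |κ|, which hints at why cheap, accurate approximations of the kind the abstract describes are attractive in high dimensions.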


Similar resources

Evaluation of estimation methods for parameters of the probability functions in tree diameter distribution modeling

One of the most commonly used statistical models for characterizing the variations of tree diameter at breast height is the Weibull distribution. The usual approach for estimating parameters of a statistical model is maximum likelihood estimation (the likelihood method). Usually, this relies on iterative algorithms such as Newton-Raphson. However, the efficiency of the likelihood method is not...


Value at Risk Estimation using the Kappa Distribution with Application to Insurance Data

Heavy-tailed distributions have mostly been used for modeling financial data. The kappa distribution has a higher peak and heavier tail than the normal distribution. In this paper, we consider the estimation of the three unknown parameters of a Kappa distribution for evaluating the value at risk measure. The value at risk (VaR) as a quantile of a distribution is one of the import...


Step change point estimation in the multivariate-attribute process variability using artificial neural networks and maximum likelihood estimation

In some statistical process control applications, a combination of correlated variable and attribute quality characteristics represents the quality of the product or the process. In such processes, identifying the time at which out-of-control states manifest can help quality engineers eliminate the assignable causes through proper corrective actions. In this paper, f...


Hyperbolic Cosine Log-Logistic Distribution and Estimation of Its Parameters by Using Maximum Likelihood Bayesian and Bootstrap Methods

In this paper, a new probability distribution, based on the family of hyperbolic cosine distributions is proposed and its various statistical and reliability characteristics are investigated. The new category of HCF distributions is obtained by combining a baseline F distribution with the hyperbolic cosine function. Based on the base log-logistic distribution, we introduce a new di...


Estimation of Parameters for an Extended Generalized Half Logistic Distribution Based on Complete and Censored Data

This paper considers an Extended Generalized Half Logistic distribution. We derive some properties of this distribution and then we discuss estimation of the distribution parameters by the methods of moments, maximum likelihood and the new method of minimum spacing distance estimator based on complete data. Also, maximum likelihood equations for estimating the parameters based on Type-I and Typ...


Journal:
  • J. Multivariate Analysis

Volume 114

Publication date: 2013